Feature-Rich Error Detection in Scientific Writing Using Logistic Regression
Abstract
The goal of the Automatic Evaluation of Scientific Writing (AESW) Shared Task 2016 is to identify sentences in scientific articles that need editing to improve their correctness and readability or to make them fit better within the genre at hand. We encode many different types of errors occurring in the dataset as linguistic features, and we use logistic regression to assign each sentence a probability indicating whether it needs to be edited. We participate in both tracks at AESW 2016: binary prediction and probabilistic estimation. In the former track, our model (HITS) places fifth; in the latter, it ranks first according to the evaluation metric.
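To make the setup in the abstract concrete, the following is a minimal sketch of a logistic regression that maps sentence features to a probability of needing editing, with a threshold turning that probability into a binary decision (mirroring the two AESW tracks). Everything specific here is an assumption for illustration: the three toy features, the tiny training set, the function names, and the 0.5 threshold are hypothetical and are not the feature set, data, or settings used by the HITS system.

```python
import math

def extract_features(sentence):
    """Toy, hypothetical features; the paper's actual feature set is not given here."""
    tokens = sentence.split()
    has_repeat = any(a.lower() == b.lower() for a, b in zip(tokens, tokens[1:]))
    return [
        1.0,                                                          # bias term
        1.0 if has_repeat else 0.0,                                   # immediately repeated word
        0.0 if sentence.rstrip().endswith((".", "?", "!")) else 1.0,  # missing final punctuation
        1.0 if "  " in sentence else 0.0,                             # doubled space
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, features):
    """Logistic regression: sigmoid of the weighted feature sum."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def train(examples, labels, epochs=500, lr=0.5):
    """Plain stochastic gradient descent on the log-loss."""
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = predict_proba(weights, x)
            # Gradient of the log-loss with respect to the weights is (p - y) * x.
            weights = [w - lr * (p - y) * xi for w, xi in zip(weights, x)]
    return weights

# Invented training sentences: label 1 means "needs editing".
TRAIN = [
    ("We propose a new method.", 0),
    ("The results are shown in the table.", 0),
    ("The the model converges quickly.", 1),
    ("This approach is is efficient.", 1),
    ("We evaluate on two datasets", 1),
    ("The error decreases  over time.", 1),
]

X = [extract_features(s) for s, _ in TRAIN]
y = [label for _, label in TRAIN]
WEIGHTS = train(X, y)

def needs_editing_probability(sentence):
    """Probabilistic-estimation view: a score in (0, 1)."""
    return predict_proba(WEIGHTS, extract_features(sentence))

def needs_editing(sentence, threshold=0.5):
    """Binary-prediction view: threshold the probability (0.5 is an assumed cutoff)."""
    return needs_editing_probability(sentence) >= threshold
```

Calling `needs_editing_probability("Our our system works.")` gives the probabilistic score, while `needs_editing(...)` gives the corresponding binary label. In practice the threshold would be tuned on held-out data for whichever metric the track uses.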
Publication date: 2016